ABSTRACT

Information Extraction (IE) aims to automatically generate a large
knowledge base from natural language text, but progress remains
slow. Supervised learning requires copious human annotation, while
unsupervised and weakly supervised approaches do not deliver
competitive accuracy. As a result, most fielded applications of IE, as
well as the leading TAC-KBP systems, rely on significant amounts
of manual engineering. Even “Extreme” methods, such as those
reported in Freedman et al. [11], require about 10 hours of expert
labor per relation.

This paper shows how to reduce that effort by an order of
magnitude. We present a novel system, InstaRead, that streamlines
authoring with an ensemble of methods: 1) encoding extraction
rules in an expressive and compositional representation, 2) guiding
the user to promising rules based on corpus statistics and mined
resources, and 3) introducing a new interactive development cycle
that provides immediate feedback, even on large datasets.
Experiments show that experts can create quality extractors in under
an hour and even NLP novices can author good extractors. These
extractors equal or outperform ones obtained by comparably
supervised and state-of-the-art distantly supervised approaches.

Categories and Subject Descriptors 

H.2.8 [Database Management]: Database Applications - Data Mining;
H.3.1 [Information Storage and Retrieval]: Content Analysis and
Indexing - Linguistic Processing; I.2.7 [Artificial Intelligence]:
Natural Language Processing - Text Analysis; I.5.5 [Pattern
Recognition]: Implementation - Interactive Systems

General Terms 

Experimentation, Human Factors 

Keywords 

information extraction, rule-based extraction, natural language
processing, interactive systems


1. INTRODUCTION 

Information Extraction (IE), the process of distilling semantic 
relations from natural language text, continues to gain attention. If 
applied to the Web, such IE systems have the potential to create 
a large-scale knowledge base which would benefit important tasks 
such as question answering and summarization. 

Applying information extraction to many relations, however,
remains a challenge. One popular approach is supervised learning
of relation-specific extractors, but these methods are limited by the
availability of training data and are thus not scalable. Unsupervised
and weakly supervised methods have been proposed, but are not
sufficiently accurate. Many successful applications of IE therefore
continue to rely on significant amounts of manual engineering. For
example, the best performing systems of the TAC-KBP slot filling
challenge make central use of manually created rules [21, 36].

In response, Freedman et al. [11] proposed Extreme Extraction,
a combination of techniques which enabled experts to develop five
slot-filling extractors in 50 hours, starting with just 20 examples per
slot type. These extractors outperformed ones learned with manual
supervision and also required less effort when data labeling costs
were included.

In this work, we seek to dramatically streamline the process of
extractor engineering, while handling the more general task of
binary relations, r(a, b), where both arguments are free. Our goal
is to enable researchers to create a high-quality relation extractor
in under one hour, using no prelabeled data. To achieve this goal,
we propose an extractor development tool, InstaRead, which
defines a user-system interaction based on three key properties. First,
experts can write compositional rules in an expressive logical
notation. Second, the system guides the expert to promising rules, for
example through a bootstrap rule induction algorithm which
leverages the distribution of the data. Third, rules can be tested instantly
even on relatively large datasets.

This paper makes the following contributions: 

• We present InstaRead, an integrated ensemble of methods 
for rapid extractor construction. 

• We show how these components can be implemented to enable 
real-time interactivity over millions of documents. 

• We evaluate InstaRead empirically, showing 1) an expert
user can quickly create high precision rules with large recall¹
that greatly outperform comparably supervised and
state-of-the-art distantly supervised approaches and require one tenth
the manual effort of Freedman et al.'s [11] approach. We also present
2) the cumulative gains due to different InstaRead features,
as well as 3) an error analysis indicating that more than half
of extractor mistakes stem from problems during preprocessing
(e.g., parsing or NER). We further show that 4) even NLP
novices can use InstaRead to create quality extractors for
many relations.

1 All rule sets developed as part of this work, the training data
produced by odesk workers, and the output extractions from each
system are available upon request.

2. PROBLEM DEFINITION 

Engineering competitive information extractors often involves
the development of a carefully selected set of rules. The rules are
then used in a number of ways, for example, to create
deterministic rule-based extractors [21, 36], or as features or constraints in
learning-based systems [28, 13]. Typically, however, the
development of rules is an iterative process of refinement that involves (1)
analyzing a development corpus of text for variations of relation
mentions, (2) creating hypotheses for how these can be
generalized, (3) formulating these hypotheses in a rule language, and (4)
testing the rules on the corpus.

Unfortunately, each of the steps in this cycle can be very
time-intensive. For example, when analyzing a corpus, an expert may
spend much time searching for relevant sentences. When creating
hypotheses, an expert may not foresee possible over- or
under-generalizations. An expert's intended generalization may also not
be directly expressible in a rule language, and testing may be
computationally intensive, in which case the expert is unable to obtain
immediate feedback.

Our goal in this work is to develop and compare techniques
which accelerate this cycle, so that engineering a competitive
extractor requires less than an hour of expert time.

3. PRIOR WORK 

Freedman et al. [11]'s landmark work on ‘Extreme Extraction’
first articulated the important challenge of investigating the
development of information extractors within a limited amount of time.
Their methods, which allowed an expert to create a question
answering system for a new domain in one week, used orders of
magnitude less human engineering than the norm for ACE, TAC-KBP,
and MUC competitions. Key to this improvement was a hybrid
approach that combined a bootstrap learning algorithm and manual
rule writing. Freedman et al. showed that this combination yields
higher recall and F1 compared to approaches that used only
bootstrap learning or manual rule writing, but not both.

Freedman et al.’s task is related to but different from our task; some
of these differences make it harder and others easier. In
particular, Freedman et al. sought to extract relations R_i(arg_1, arg_2), in
which one of the arguments was fixed. A small amount of training
data was assumed for each relation, and the task included
adaptation of a named entity recognizer and coreference resolution
system.

With InstaRead we propose a combination of techniques that are
different from, and complementary to, those of Freedman et al., and
that focus on streamlining rule authoring. One of these techniques
leverages a refined and very effective bootstrap algorithm that keeps
the user in the loop, whereas Freedman et al.’s bootstrap learner ran
autonomously without user interaction.

A large amount of other work has looked at bootstrapping
extractors from a set of seed examples. Carlson et al.’s NELL [3]
performs coupled semi-supervised learning to extract a broad set
of instances, relations, and classes from a Web-scale corpus. Two
of the four relations we report results for in Section 8 are
covered by NELL, yielding 174 (attendedSchool) and 977 (married)
instances after 772 iterations. To avoid a decline in precision (on
average 57% after 66 iterations), NELL relies on periodic human
supervision of 5 minutes per relation every 10 iterations. Other
systems leverage large knowledge bases for supervision.
PROSPERA [23] uses MaxSat-based constraint reasoning and improves
on the results, but requires that relation arguments be contained in
the YAGO knowledge base [35]. Only one of our four relations
is extracted by PROSPERA (attendedSchool), yielding 1,371
instances at 78% precision; 26,280 instances of this relation from
YAGO were used for supervision. DeepDive [26, 27, 30] scales a
Markov logic program to the same corpus and uses Freebase for
supervision. It reaches an F1 score of 31% on the TAC-KBP
relation extraction benchmark. MIML-RE [37] and MultiR [14] also
apply distant supervision from a knowledge base but add global
constraints to relax the assumption that every matching sentence
expresses a fact in the knowledge base. Except for the latter two
systems, to which we compare, none of the above systems is
publicly released, making a direct comparison impossible. In general,
all of the above approaches suffer from relatively low recall and
combat semantic drift by relying on redundancy, global constraints,
large knowledge bases, or validative feedback.

While the above approaches attempt to avoid manual input altogether,
other approaches try to make manual input more effective. These
include compositional pattern languages for specifying rule-based
extractors, such as CPSL [1] and TokensRegex [4]. Rule-based
extraction has also been scaled to larger datasets by applying query
optimization [30]. Unlike our work, this line of research does not
evaluate the effectiveness of these languages with users, in terms
of development time and extraction quality. Another approach to
human input is active learning. Miller et al. [20] learn a
Perceptron model for named-entity extraction; unlabeled examples are
ranked by difference in perceptron score. Riloff [32] proposes an
approach to named-entity extraction, which requires users to first
classify documents by domain, and then generates and ranks
candidate extraction patterns. Active learning has also been studied in
more general contexts, for learning probabilistic models with
labeled instances [38] or labeled features [9]. This work differs from
approaches based on active learning in at least two ways. First, they
have not been evaluated on relation extraction tasks. Second, and
more importantly, their general approach is to consider a
particular type of feedback and then develop algorithms for learning more
accurately from such feedback. In contrast, our approach is not
to compare algorithms, but to compare different types of manual
feedback.

4. OVERVIEW OF INSTAREAD 

To evaluate techniques for accelerating the development process
of rule-based extractors, we developed InstaRead, an interactive
extractor development tool. InstaRead is designed to address the
inefficiencies identified in Section 2 with the combination of an
expressive rule language, data-driven guidance to new rules, and
instant rule execution.

We next present an example of an expert interacting with the 
system, and then in the following sections show how InstaRead 
enables its three key properties. 

4.1 Example 

Anna wants to create an extractor for the killed(killer, victim)
relation. After selecting a development text corpus, she proceeds as
follows.

• To find example sentences, Anna searches for sentences
containing the keyword ‘killed’ (Figure 1a). InstaRead suggests
that she also consider distributionally similar keywords such as
‘murder’ or ‘assassin’. Within seconds Anna obtains many
relevant examples.

• Anna compares examples, investigating their syntactic
structure obtained by a parser, and encodes an extraction pattern
as a rule: killed(a, b) ⇐ nsubj(c, a) & dobj(c, b) &
token(c, ‘murdered’). The system offers to automatically
generalize that rule, so that it also covers the passive form
as well as all tenses.

• Anna now has a working extractor which she would like to
refine. InstaRead’s bootstrapping method presents her a
ranked list of new candidate rules based on the extractions of
her existing rule set. Anna inspects matches of the suggested
rules and selects several (Figure 1b).

• Looking at the rules collected so far, Anna notices that many
are similar, differing only in the verb that was used. She
decides to refactor her rule set, so that one rule first identifies
relevant verbs, and others syntactic structure. Her rule set is
now more compact and generalizes better.

[Figure 1 (screenshots): (a) keyword search for ‘murder’ showing
matching sentences and a list of distributionally similar terms with
their corpus counts; (b) rule editor for killed(a,b) showing ranked
bootstrap rule suggestions and a library of collected example rules.]

Figure 1: Selected interactions (Section 4). InstaRead guides
users to sentences and rules by distributionally-similar words (top),
and bootstrap pattern learning (bottom).

5. CONDITION-ACTION RULES IN LOGIC 

A rule language should be both simple and expressive, so that
the interaction with the system is quick and direct. To fulfill this
requirement, InstaRead uses condition-action rules expressed in
first-order logic, combined with a broad and expandable set of
built-in logical predicates. For tractability, InstaRead requires that
rules translate into safe domain-relational calculus [39].

Although such rules could be used to generate statistical
models [8], we currently assume that all rules are deterministic and are
executed in a defined order, and leave the integration with
learning-based techniques as future work.

Figure 2 presents an example rule set and how it is applied to
a sentence. The predicates used in this example have arguments
that range over tokens and token positions. Rules are used to define
predicate killOfVictim and that predicate then gets re-used in other
rules to define predicate killed. We call this ability InstaRead’s
Composition feature.

To increase the expressiveness of the language, InstaRead
implements around one hundred built-in predicates, such as
tokenBefore and isCapitalized. In addition, it makes available predicates
that encode the output of (currently) four NLP systems, including
a phrase structure parser [5], a typed dependency extractor [19], a
coreference resolution system [29], and a named-entity tagger [10].
This allows users to write rules which simultaneously use parse,
coreference, and entity type information.


(a) Rules. Dependency predicates nn, poss, nsubjpass, prep-by,
prep-for, prep-of and named-entity predicate person are
precomputed.

killNoun(‘murder’)
killNoun(‘assassination’)
killNoun(‘killing’)
killNoun(‘slaughter’)

killOfVictim(c, b) ⇐ prep-of(c, b) ∧ token(c, d) ∧ killNoun(d)
killOfVictim(c, b) ⇐ nn(c, b) ∧ token(c, d) ∧ killNoun(d)
killOfVictim(c, b) ⇐ poss(c, b) ∧ token(c, d) ∧ killNoun(d)

killed(a, b) ⇐ person(a) ∧ person(b) ∧ nsubjpass(c, a) ∧
               token(c, ‘sentenced’) ∧ prep-for(c, d) ∧
               killOfVictim(d, b)

killed(a, b) ⇐ person(a) ∧ person(b) ∧ prep-by(c, a) ∧
               killOfVictim(c, b)

(b) Example sentence with typed dependencies.

Mr. Williams was sentenced for the murder of Wright

(c) Predicted ground instances. Here, murder-7 refers to the 7th
token in the example sentence.

killOfVictim(murder-7, Wright-10)
killed(Williams-2, Wright-10)

Figure 2: Example rule set executed on sentence.
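How the rule set in Figure 2 composes can be sketched in a few lines of Python. This is an illustrative sketch only, not InstaRead's implementation: the dependency edges, the entity tags, and the position of ‘sentenced’ are hand-written assumptions standing in for parser and NER output, while the indices of Williams, murder, and Wright are taken from the predicted instances in Figure 2(c).

```python
# Hand-written facts for the example sentence (assumed parser/NER output).
KILL_NOUNS = {"murder", "assassination", "killing", "slaughter"}
token = {2: "Williams", 4: "sentenced", 7: "murder", 10: "Wright"}
person = {2, 10}
deps = {  # relation name -> set of (head, dependent) token positions
    "nsubjpass": {(4, 2)},
    "prep-for": {(4, 7)},
    "prep-of": {(7, 10)},
    "nn": set(),
    "poss": set(),
    "prep-by": set(),
}

def kill_of_victim():
    """killOfVictim(c, b): a kill noun c attached to b via prep-of, nn, or poss."""
    return {(c, b)
            for rel in ("prep-of", "nn", "poss")
            for (c, b) in deps[rel]
            if token.get(c) in KILL_NOUNS}

def killed():
    """killed(a, b): both rules of Figure 2, re-using killOfVictim."""
    kov = kill_of_victim()
    out = set()
    # Rule 1: a was sentenced for the <kill noun> of b.
    for (c, a) in deps["nsubjpass"]:
        if token.get(c) != "sentenced":
            continue
        for (c2, d) in deps["prep-for"]:
            if c2 != c:
                continue
            for (d2, b) in kov:
                if d2 == d and a in person and b in person:
                    out.add((a, b))
    # Rule 2: the <kill noun> of b by a.
    for (c, a) in deps["prep-by"]:
        for (c2, b) in kov:
            if c2 == c and a in person and b in person:
                out.add((a, b))
    return out

print(kill_of_victim())  # pairs of token positions (kill noun, victim)
print(killed())          # pairs of token positions (killer, victim)
```

Because killed only refers to killOfVictim by name, adding a new kill noun or a new attachment pattern extends both killed rules without touching them.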

The rules in Figure 2 are all Horn clauses, but InstaRead also
supports disjunction (∨), negation (¬), and existential (∃) and
universal (∀) quantification. While one does not often need these
operators, they are sometimes convenient for handling specific lexical
ambiguities. For example, in our evaluation discussed in Section 8, one
user of InstaRead created the following rule to extract instances
of the founded(person,organization) relation:

founded(a, b) ⇐ nsubj(c, a) ∧ dobj(c, b) ∧ token(c, ‘built’) ∧
                person(a) ∧ organization(b)

This rule was designed to match sentences such as: ‘Michael Dell
built his first company in a dorm-room.’ However, this rule also
incorrectly matches a number of other sentences such as: ‘Mr. Harris
built Dell into a formidable competitor to IBM.’ While ‘building
an organization’ typically implies a founded relation, ‘building an
organization into something’ does not. This distinction can be
captured in our rule by adding the conjunct ¬(∃d : prep-into(c, d)).

4 InstaRead uses collapsed dependencies with propagation of
conjunct dependencies.
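The effect of adding the conjunct ¬(∃d : prep-into(c, d)) to the founded rule can be illustrated with a small Python sketch. The dependency and entity facts below are hand-written assumptions standing in for preprocessing output on the two example sentences (tokens are identified by their surface strings for brevity); only the rule itself comes from the text.

```python
def founded(sent, exclude_into=True):
    """Evaluate founded(a, b) on the facts of one sentence.

    With exclude_into=True the rule carries the extra conjunct
    NOT (EXISTS d : prep-into(c, d)).
    """
    results = set()
    for (c, a) in sent["nsubj"]:
        for (c2, b) in sent["dobj"]:
            if c2 != c or c != "built":
                continue
            if a not in sent["person"] or b not in sent["organization"]:
                continue
            # Negated existential: reject if the verb has any prep-into child.
            if exclude_into and any(head == c for (head, _d) in sent["prep-into"]):
                continue
            results.add((a, b))
    return results

# 'Michael Dell built his first company in a dorm-room.'
s1 = {
    "nsubj": {("built", "Michael Dell")},
    "dobj": {("built", "company")},
    "prep-into": set(),
    "person": {"Michael Dell"},
    "organization": {"company"},  # assumed entity type for illustration
}
# 'Mr. Harris built Dell into a formidable competitor to IBM.'
s2 = {
    "nsubj": {("built", "Mr. Harris")},
    "dobj": {("built", "Dell")},
    "prep-into": {("built", "competitor")},
    "person": {"Mr. Harris"},
    "organization": {"Dell"},
}

print(founded(s1))                      # still matched
print(founded(s2))                      # rejected by the negated conjunct
print(founded(s2, exclude_into=False))  # the original, over-general rule
```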

6. GUIDING EXPERTS TO EFFECTIVE 
RULES 

While the rule language is important for developing extractors, 
many hours of testing on early prototypes of InstaRead showed 
that it is not sufficient for an effective interaction. With a growing 
number of rules, users find it increasingly difficult to identify rules 
for refinement. More importantly, users don’t know where to focus 
their attention when trying to find effective rules to add. 

Feedback. 

Our small example in Figure 2 already demonstrates the
problem: What exactly does each rule do? How much data does each
rule affect? In early testing, we noticed that users would often
write short comments for each rule, consisting of the surface tokens
matched. We therefore designed a technique to automatically
generate such comments (depicted in Figure 3) by retrieving matched
sentences, identifying sentence tokens that were explicitly
referenced by one of the predicates, and concatenating the tokens in
the order that they appear in a sentence. Included are ‘...’
placeholders for the arguments of the rule’s target predicate. Figure 3
also shows how InstaRead displays the number of matches
together with each rule, e.g., 257 for ‘... stabbed ...’, helping users
quickly judge the importance of a rule. Although one may also
be interested in precision, that cannot be obtained without
annotated data. InstaRead also includes visualizations for
dependency trees, parse trees, and coreference clusters. Such
visualizations do not always convey all information encoded in the
logical representation, but convey (approximate) meaning or relevance
quickly.
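The comment-generation step described above can be sketched as follows. This is a minimal illustration under stated assumptions: the function name `rule_comment` is hypothetical, and the token positions referenced by the rule and bound to its arguments are supplied by hand rather than obtained from an actual rule match.

```python
def rule_comment(sentence_tokens, referenced_positions, arg_positions):
    """Concatenate the tokens a rule refers to, in sentence order,
    with '...' standing in for the rule's two arguments."""
    marked = {pos: "..." for pos in arg_positions}
    marked.update({pos: sentence_tokens[pos] for pos in referenced_positions})
    return " ".join(marked[pos] for pos in sorted(marked))

# Made-up example sentence; positions are 0-based list indices.
tokens = ["Smith", "was", "accused", "of", "the", "slaying", "of", "Jones"]
comment = rule_comment(
    tokens,
    referenced_positions=[2, 3, 5, 6],  # accused, of, slaying, of
    arg_positions=[0, 7],               # the two relation arguments
)
print(comment)  # -> ... accused of slaying of ...
```

Tokens that the rule does not mention (here ‘was’ and ‘the’) are dropped, which keeps the comment short enough to scan in a long rule list.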

[Figure 3 (screenshot): a list of rules, each shown with an
automatically generated comment such as ‘... accused of slaying
of ...’, ‘... killed ...’, or ‘... stabbed ...’ and its number of
matches (e.g., 257).]

Figure 3: InstaRead shows automatically generated comments
and number of extractions together with each rule, allowing
users to see (approximate) meaning and relevance without
needing to read logical expressions.


Bootstrap Rule Induction.

How does an expert know what rules to write? Coming up with
good candidates is surprisingly difficult. One approach is
automatic rule suggestions based on statistics. This can be done, for
example, using a semi-supervised bootstrap pattern learning
algorithm. Freedman et al. [11] applied such an algorithm, too, but
found that it was not competitive with manual pattern writing,
especially with regard to recall. InstaRead’s bootstrap algorithm
therefore makes several changes: First, it instantly returns ranked
bootstrap results over a large corpus. Second, it takes into account
coreference information to expand recall (similarly to Gabbard et
al. [12]). Third, it puts the user into the loop, allowing her to select
appropriate rules after each iteration.

In particular, InstaRead’s bootstrap technique takes as input a
binary relation predicate r(a, b) together with a set of rules R
defining instances of r. The output is a ranked list of candidate rules S.
The algorithm works by first identifying mentions of r using the
existing rules R and generating the pairs (a_s, b_s) of argument
surface strings of these mentions. This set of pairs is then matched to
the entire corpus, retrieving all sentences containing both strings.
Similar to DIRT [18], InstaRead then generates syntactic-lexical
extraction patterns from these matches. Loosely following Mintz
et al. [22], the system finds a path of syntactic dependencies
connecting the matched surface strings, and then creates a rule that is
a conjunction of syntactic dependencies and lexical constraints on
that path, as well as entity type constraints (if activated by the user).
For examples, refer to Figure 1.
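The pattern-generation step (dependency path between the two argument mentions, turned into a conjunction of dependency and lexical constraints) can be sketched as below. This is a simplified illustration, not the system's algorithm: the function names, the `(relation, head, dependent)` edge encoding, and the example sentence are assumptions, entity type constraints and coreference are omitted, and the output string merely imitates the `:=` and `&` notation of the tool's rule editor.

```python
from collections import deque

def dependency_path(edges, start, goal):
    """Shortest undirected path between two token positions.
    `edges` is a list of (relation, head, dependent) triples."""
    adj = {}
    for rel, head, dep in edges:
        adj.setdefault(head, []).append(((rel, head, dep), dep))
        adj.setdefault(dep, []).append(((rel, head, dep), head))
    prev = {start: None}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            break
        for edge, nxt in adj.get(node, []):
            if nxt not in prev:
                prev[nxt] = (node, edge)
                queue.append(nxt)
    if goal not in prev:
        return None
    path, node = [], goal
    while prev[node] is not None:
        node, edge = prev[node]
        path.append(edge)
    return path[::-1]

def path_to_rule(name, edges, tokens, arg_a, arg_b):
    """Turn the dependency path between two arguments into a rule string."""
    path = dependency_path(edges, arg_a, arg_b)
    if path is None:
        return None
    var = {arg_a: "a", arg_b: "b"}
    fresh = (chr(i) for i in range(ord("c"), ord("z") + 1))
    conjuncts = []
    for rel, head, dep in path:
        for tok in (head, dep):
            if tok not in var:
                var[tok] = next(fresh)
        conjuncts.append(f"{rel}({var[head]},{var[dep]})")
    # Lexical constraints on the inner tokens of the path.
    for tok, v in var.items():
        if tok not in (arg_a, arg_b):
            conjuncts.append(f"token({v},'{tokens[tok]}')")
    return f"{name}(a,b) := " + "&".join(conjuncts)

# Made-up sentence: 'Booth assassinated Lincoln'
tokens = {1: "Booth", 2: "assassinated", 3: "Lincoln"}
edges = [("nsubj", 2, 1), ("dobj", 2, 3)]
rule = path_to_rule("killed", edges, tokens, arg_a=1, arg_b=3)
print(rule)  # -> killed(a,b) := nsubj(c,a)&dobj(c,b)&token(c,'assassinated')
```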

Rule suggestions, S, can be sorted by either of two scores: pointwise
mutual information of the suggested rule with the original rule set R,
or the number of extractions of the suggested rule. The latter may
show more irrelevant rules on top, but the relevant ones among them
have many extractions, often reducing overall effort. Users can switch
between the sort orders.
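The two sort orders can be sketched as follows. The text does not spell out how the probabilities in the pointwise mutual information are estimated; this sketch assumes, purely for illustration, that each rule is represented by the set of argument pairs it extracts and that probabilities are relative frequencies over a hypothetical total number of candidate pairs.

```python
import math

def pmi(candidate, seed, total):
    """log( P(candidate, seed) / (P(candidate) * P(seed)) ), with
    probabilities estimated as set sizes divided by `total`."""
    both = len(candidate & seed)
    if both == 0:
        return float("-inf")
    return math.log((both * total) / (len(candidate) * len(seed)))

def rank(candidates, seed, total, by="pmi"):
    """Return candidate rule names, best first, under the chosen score."""
    if by == "pmi":
        key = lambda item: pmi(item[1], seed, total)
    else:  # by number of extractions
        key = lambda item: len(item[1])
    ordered = sorted(candidates.items(), key=key, reverse=True)
    return [name for name, _ in ordered]

# Extractions of the existing rule set R (made-up pairs).
seed = {("Booth", "Lincoln"), ("Oswald", "Kennedy")}
# Extractions of two suggested rules (made-up pairs).
candidates = {
    "assassinated": {("Booth", "Lincoln"), ("Oswald", "Kennedy")},
    "shot": {("Booth", "Lincoln"), ("A", "B"), ("C", "D"), ("E", "F")},
}

print(rank(candidates, seed, total=100, by="pmi"))    # precise rule first
print(rank(candidates, seed, total=100, by="count"))  # frequent rule first
```

The example shows why both orders are useful: the frequent but noisier rule only rises to the top under the count-based order.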

Word-level Distributional Similarity. 

Although our enhancements to the bootstrap approach may
increase recall, recall is still limited since bootstrap requires that the
same tuples appear multiple times in the corpus. To help experts
find additional relation mentions, InstaRead therefore also
includes another shallow technique: keyword search combined with
keyword suggestions.

Keywords are suggested based on distributional similarity to a
seed keyword. For example, the seed ‘murdered’ returns
‘assassinated’, ‘slayed’, ‘shot’, and more. Specifically, each word w in
the text corpus is represented as a vector of weighted words v
co-occurring in sentences with w. The similarity of two words w_1, w_2
is then defined as the cosine similarity of their vector
representations. An additional list of keyword suggestions shows keywords
which contain the seed keyword as prefix. Suggested keywords are
always displayed together with their number of occurrences in the
corpus to guide users to the most relevant keywords.
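A minimal sketch of this similarity computation is given below. The weighting scheme is not specified in the text, so the sketch assumes raw sentence-level co-occurrence counts as weights; the function names and the toy corpus are likewise illustrative.

```python
import math
from collections import Counter, defaultdict

def cooccurrence_vectors(sentences):
    """Map each word to a Counter of words sharing a sentence with it."""
    vectors = defaultdict(Counter)
    for sent in sentences:
        words = set(sent)
        for w in words:
            for v in words:
                if v != w:
                    vectors[w][v] += 1
    return vectors

def cosine(u, v):
    """Cosine similarity of two sparse vectors stored as Counters."""
    dot = sum(weight * v[k] for k, weight in u.items() if k in v)
    norm = (math.sqrt(sum(x * x for x in u.values()))
            * math.sqrt(sum(x * x for x in v.values())))
    return dot / norm if norm else 0.0

def suggest(seed, vectors, k=3):
    """Return the k words most similar to the seed keyword."""
    seed_vec = vectors.get(seed, Counter())
    scores = [(cosine(seed_vec, vec), w)
              for w, vec in vectors.items() if w != seed]
    return [w for _, w in sorted(scores, reverse=True)[:k]]

corpus = [
    ["police", "murdered", "victim"],
    ["police", "assassinated", "victim"],
    ["chef", "cooked", "dinner"],
]
vectors = cooccurrence_vectors(corpus)
print(suggest("murdered", vectors, k=1))  # -> ['assassinated']
```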

Although this keyword-based approach may be effective in
finding relevant sentences, early experiments showed that much
time is spent writing extraction rules based on those sentences.
We therefore added a simple interface feature: Experts can click
on words to indicate relation arguments, and the system will
generate rule candidates using our bootstrap generation algorithm.

Core Linguistic Rules. 

The final problem we address is the fact that a relation extractor 
typically needs a large number of rules that are not specific to the 
relation. For example, there exist many syntactic variations that 
follow common linguistic patterns. To reduce effort, we seek to 
populate the system with such general rules right from the start. 

In a first step, we encoded a set of grammatical rules: Given a
verb base form, InstaRead can generate rules encoding
syntactic-lexical patterns for 182 combinations of tense, voice, and person.
For example, given subject X, object Y and verb ‘kill’, the system
generates rules to capture phrases such as ‘Y was killed by X’, ‘X
regretted killing Y’, ‘X would later kill Y’. To avoid inaccuracies
from using a stemmer, InstaRead includes inflection rules and a
corpus of inflections for 16851 verbs mined from Wiktionary. This
grammatical background knowledge is provided to the user through
a set of additional built-in predicates.

7. EFFICIENT RULE EVALUATION

To enable its interactivity, InstaRead must evaluate rules and 
guide users to effective rules very quickly, even with compositional 
rules and large datasets. 

InstaRead is built on top of an RDBMS. Variables in its
logical expressions are assigned a data type that can be Pos (token
position), Span (token span), Int (integer), Str (string), or Ref
(reference). Each of these data types is internally mapped to a
composite SQL data type. For example, token spans are mapped to the
SQL types integer, byte, byte, where the first is used to identify
a sentence and the others indicate start and end positions within a
sentence. Predicates are either extensional or intensional.
Extensional ones materialize instances in relational tables, while
intensional ones are defined by (partial) SQL queries. An example of an
extensional predicate is our killed(a, b) extractor, which stores the
result set of the extraction rules depicted in Figure 2. An example
of an intensional predicate is str2span(s, t), which returns all
mentions of a multi-word string using an inverted index. For details
on how this predicate gets translated into SQL, see Figure 5 in the
appendix.

The key component of InstaRead’s implementation is
its translation of logical rules into SQL queries. The system first
parses logical rules into an abstract syntax tree (AST). To ensure
that the rules do not yield infinite result sets and can be translated
into SQL, it checks for safety [39]. It then infers variable types and
links predicates, then translates into an AST of tuple relational
calculus, and eventually SQL, following the algorithms described in
[39]. For an example translation, see Appendix A.
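To give a feel for what such a translation produces, the following sketch hand-translates one rule into a SQL join and runs it with Python's built-in sqlite3 module. The table layout (one table per predicate, keyed by sentence id and token positions), the index, and the toy data are assumptions made for illustration; the query is written by hand and is not output of InstaRead's translator.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE nsubj (sent INTEGER, head INTEGER, dep INTEGER);
CREATE TABLE dobj  (sent INTEGER, head INTEGER, dep INTEGER);
CREATE TABLE token (sent INTEGER, pos INTEGER, word TEXT);
CREATE INDEX token_word ON token (word);
""")
conn.executemany("INSERT INTO token VALUES (?, ?, ?)", [
    (1, 1, "Booth"), (1, 2, "assassinated"), (1, 3, "Lincoln"),
    (2, 1, "Lincoln"), (2, 2, "visited"), (2, 3, "Richmond"),
])
conn.executemany("INSERT INTO nsubj VALUES (?, ?, ?)", [(1, 2, 1), (2, 2, 1)])
conn.executemany("INSERT INTO dobj VALUES (?, ?, ?)", [(1, 2, 3), (2, 2, 3)])

# killed(a, b) := nsubj(c, a) & dobj(c, b) & token(c, 'assassinated')
# The shared variable c becomes equality conditions between the joined
# tables; the constant becomes a WHERE filter.
KILLED_SQL = """
SELECT n.sent, n.dep AS a, d.dep AS b
FROM nsubj AS n
JOIN dobj  AS d ON d.sent = n.sent AND d.head = n.head
JOIN token AS t ON t.sent = n.sent AND t.pos  = n.head
WHERE t.word = 'assassinated'
"""
rows = conn.execute(KILLED_SQL).fetchall()
print(rows)  # -> [(1, 1, 3)]
```

Each conjunct maps to one joined table, which is why rule evaluation can lean on the database's join ordering and indices.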

For performance, InstaRead creates a BTree index for each
column of an extensional predicate. Built-in predicates (which tend
to contain more instances, e.g., all syntactic dependencies) also use
multi-column indices. A variety of information is pre-computed on
a Hadoop cluster, including phrase structure trees, dependencies,
coreference clusters, named-entities, rule candidates for
bootstrapping, and distributionally similar words.

This large number of indices and pre-computed information is 
important because InstaRead does not constrain the set of queries 
and most queries touch the entire text corpus. It also allows each 
iteration of bootstrapping to be performed by a single SQL query. 
Across all of the experiments the median query execution time was 
74ms. Achieving such interactivity is crucial for quickly building 
accurate extractors. 

8. EXPERIMENTS 

In our evaluation, we measured whether InstaRead’s features enable
an expert to create quality extractors in less than one hour, and
which of the features contribute most to reducing effort. We also
report on an error analysis to get insights into potential future
improvements. Finally, we report early results of a follow-up
experiment, in which we evaluated InstaRead’s usability among
engineers without NLP background.

8.1 Experimental Setup 

We evaluated the performance of an expert in a controlled
experiment, in which the expert user was given one hour of time per
relation to develop four relation extractors. Besides descriptions of
the four relations and a corpus of (unlabeled) news articles, which
was loaded into InstaRead, no other resources were provided.
Our expert was familiar with InstaRead and NLP in general, but
had no experience with the relations tested. All user and system
actions were logged together with their timestamps.

We were interested in determining the effectiveness of four of
InstaRead’s features: bootstrap rule induction (Bootstrap),
word-level distributional similarity (WordSim), core linguistic rules
(Linguistics), and the power of rule (de-)composition (Composition). In
order to more easily measure the impact of each of these features,
our user was required to use only one at a time and switch to the
next at given time intervals (Figure 4).

Baselines.

We compare the performance of the extractors created with our
proposed system InstaRead to three baselines.

MIML-RE [37] and MultiR [14] are two state-of-the-art
systems for learning relation extractors by distant supervision from
a database. As a database we use the instances of the relations
contained in Freebase [2]. Negative examples are generated from
random pairs of entity mentions in a sentence.⁵

SUP is a supervised system which learns a log-linear model
using the set of features for relation extraction proposed by Mintz
et al. [22]. The supervision is provided by four annotators hired
on odesk.com who rated themselves as experts for data entry,
and were encouraged to use any tool of their choice for annotation.
Each annotator was asked to spend 1 hour per relation to identify
sentences in the development corpus containing that relation and
mark its arguments. To control the variation related to the
order in which relations were presented (users get faster with time),
we used a Latin square design and paid for 1 additional hour before
the experiment to allow users to get familiar with the task. Negative
examples were added as in the distantly supervised cases.
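A Latin square assignment for four annotators and four relations can be sketched as below. The text does not state which square was used; the cyclic construction here is an assumption chosen only because it is the simplest one in which every relation appears exactly once in each presentation slot and once per annotator.

```python
def latin_square(items):
    """Cyclic Latin square: row i is `items` rotated left by i positions."""
    n = len(items)
    return [[items[(row + col) % n] for col in range(n)] for row in range(n)]

relations = ["attendedSchool", "founded", "killed", "married"]
schedule = latin_square(relations)
for annotator, order in enumerate(schedule, start=1):
    print(f"annotator {annotator}: {' -> '.join(order)}")
```

With this assignment, any speed-up from practice is spread evenly over the four relations instead of always favoring the relation presented last.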

Datasets. 

We used the New York Times Annotated Corpus [34] comprising
1.8M news articles (45M sentences) published between 1987 and
2007. A random half of the articles were used for development, the
other half for testing.

Relations. 

We selected four relations: attendedSchool (person, school), 
founded (founder, organization), killed (killer, victim), and 
married (spouse1, spouse2). These relations were selected because 
they cover a range of domains, they were part of previous 
evaluations [16, 17, 33], and they do not require recognition of 
uncommon entity types (see footnote 6). For preprocessing, we used 
the CJ Parser [5] and Stanford’s dependency [19], coreference [29], 
and NER [10] systems. 


5 We added negative examples at a ratio of 50:1 to positives. 
Increasing this ratio increases precision but reduces the number of 
extractions, while decreasing it has the opposite effect. We found 
that this setting provided a better trade-off than the default used by 
these distantly supervised systems on the data by Riedel et al. [31], 
which returned no extractions in our case. 

6 InstaRead’s bootstrap rule induction and core linguistic rules 
currently target only binary relations, not entity types. To identify 
named entities of types person and organization, we thus used 
the Stanford NER system. To handle relation attendedSchool we 
additionally created a recognizer for type school by listing 30 
common head words such as ‘University’ before the experiment. This 
process took under 5 minutes. 


                    attendedSchool  founded  killed  married 
InstaRead (rules)               94       97     141       48 
SUP (examples)                  68       79      36       52 


Table 2: Manual input generated in one hour of time. In the 
supervised case, annotators had difficulty finding examples for 
the killed relation which had fewer mentions in the corpus. In 
contrast, InstaRead’s effort-reducing features, such as rule 
suggestions, made it easy to find examples and add relevant 
rules quickly. Our user of InstaRead actually generated more 
rules for this relation, in the allotted time, due to the larger 
number of syntactic variations. 

Metrics. 

Extractions were counted on a mention level, which means that 
an extraction consisted of both a pair of strings representing named 
entities as well as a reference to the sentence expressing the 
relation. To measure precision, we sampled 100 extractions and 
manually created annotations following the ACE guidelines [7]. 
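The sampling-based precision estimate can be illustrated with a short sketch. The `Extraction` record and the `is_correct` callback (standing in for a manual judgment) are assumptions made for this example; only the mention-level definition and the sample size of 100 come from the text.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Extraction:
    """A mention-level extraction: two argument strings plus the sentence."""
    arg1: str
    arg2: str
    sentence_id: int


def estimate_precision(extractions, is_correct, sample_size=100, seed=0):
    """Estimate precision from a uniform random sample of extractions.

    is_correct is a callable that returns True when the (manually judged)
    extraction is correct.
    """
    if not extractions:
        raise ValueError("no extractions to sample from")
    rng = random.Random(seed)
    sample = rng.sample(extractions, min(sample_size, len(extractions)))
    return sum(1 for e in sample if is_correct(e)) / len(sample)
```

Because two extractions of the same entity pair from different sentences are distinct records, they are sampled and judged separately.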

8.2 Comparing InstaRead to Baselines 

Overall results are summarized in Table 1. In the case of InstaRead, 
precision was 90% or higher (see footnote 7) for each of the four 
relations, and each extractor returned thousands of tuples. For the 
three baselines, results varied between relations but in all cases 
significantly fewer extractions were returned, and in all but two 
cases precision was significantly lower. The most challenging of the 
four relations was killed, since it can be expressed in many different 
ways, and many such expressions have multiple meanings. At the same 
time, mentions of the killed relation occur less frequently in the 
corpus than mentions of the other three relations. The supervised 
baseline did not return results, and the distantly supervised systems 
could not be applied because Freebase did not contain instances for 
the killed relation. 

User Feedback. 

Looking more carefully at the feedback supplied by our users, 
we found that one hour of InstaRead use yielded 95 rules on 
average. This compares to an average of 59 examples per hour 
annotated by users in the supervised case. InstaRead’s 
effort-reducing features made it easy to find relevant sentences and 
add rules quickly, which frequently required only confirmation of a 
system-generated suggestion. Users in the supervised case had 
difficulty finding sentences expressing the relations. Two of the 
annotators reported that they started off reading the text corpus 
linearly, but barely found any examples that way. They later searched 
by keywords (‘College’) and wildcards (‘marr*’). With 77 and 89 
examples per hour these users found more examples (but not 
necessarily more variations) than users who scanned the corpus linearly 

7 The high precision in the case of InstaRead may seem surprising, 
but is in fact easy to attain for many relations. Since every 
change the user makes to the rule set immediately triggers a 
re-evaluation and visual presentation of extractions and their 
sentences, the user can quickly adapt the rule set until she is 
satisfied with precision on the training set. There is generally 
little overfitting, because the training set is large and the rules 
are not automatically selected but created by a human with intuitions 
about language. 

This contrasts with SUP, where a fixed feature set leads to high 
precision on one relation (founded), but low precision on another 
(attendedSchool). Without interactive feedback, it is very 
challenging to create an effective feature set as well as effective 
annotated examples, especially negative ones. 







             attendedSchool      founded          killed          married 
              Pr       #e      Pr       #e      Pr      #e      Pr       #e 
InstaRead    100%   52,338     91%   20,733     90%   4,728     90%   63,742 
MIML-RE        9%   14,960     28%   14,960     N/A      0*     93%    9,900 
MultiR        26%   18,480     38%   10,340     N/A      0*     51%   24,200 
SUP           12%   25,196    100%    2,255     N/A      0      44%    7,867 


Table 1: Precision (Pr) and number of extractions (#e) for the NYTimes test dataset. *Cases where extraction could not be performed 
because no target database could be found that contained examples required for distant supervision. 






[Figure 4: plot panels not recoverable; horizontal axis: Time (min).] 


Figure 4: Number of extractions on an independent test set while using InstaRead for 55 minutes. Bootstrap (Section 6) captures a 
large number of extractions quickly, but does not yield additional gains after a few minutes. WordSim (Section 6) enables slow but 
consistent gains. Linguistics (Section 6) provides a small gain. Composition (Section 5) is helpful when there exist a large number of 
lexical stems that imply a relation (e.g. for the killed relation). 


and found 19 and 51 examples per hour. 

Table 2 shows a breakdown by relation, and reveals a striking 
difference between InstaRead and SUP for the killed relation. 
In the supervised case, users were able to identify far fewer 
examples for this relation than for the others. In contrast, our user 
of InstaRead actually generated the most rules for this relation. 
This shows that InstaRead did not suffer from the problem of finding 
examples. In fact, as we will see, InstaRead’s effort-reducing 
features were actually most effective for this relation, and the 
larger number of rules was necessary to cover a larger set of 
variations. 
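The per-hour averages quoted in the User Feedback discussion follow directly from the rows of Table 2. The short check below only restates the table's numbers; the variable names are illustrative.

```python
from statistics import mean

# Manual input generated in one hour, per relation (Table 2).
rules_per_hour = {"attendedSchool": 94, "founded": 97, "killed": 141, "married": 48}
examples_per_hour = {"attendedSchool": 68, "founded": 79, "killed": 36, "married": 52}

avg_rules = mean(list(rules_per_hour.values()))        # 380 / 4 = 95
avg_examples = mean(list(examples_per_hour.values()))  # 235 / 4 = 58.75, about 59

# killed is the relation with the most rules and the fewest examples.
most_rules = max(rules_per_hour, key=rules_per_hour.get)
fewest_examples = min(examples_per_hour, key=examples_per_hour.get)
```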

Impact of Effort-Reducing Features. 

Figure 4 shows the contribution of each of the four features to the 
number of extractions. The vast majority of extractions, 84%, were 
obtained by rules created during the Bootstrap phase. Bootstrap 
has the ability to aggregate over many potential rules and then rank 
those taking into account the number of extractions. This ranking 
ensures that user effort is directed to rules which are likely to 
matter most. Such ranking is not possible with the WordSim feature, 
which, however, has a different advantage: it can find rarely used 
ways of expressing a relation. In contrast, Bootstrap only works if 
the same relation instance is expressed multiple times in different 
ways. We therefore often observe that it provides no more 
improvement after a few minutes of use. 3.4% of extractions were 
obtained by rules created during the WordSim phase, 2.6% during the 
Linguistics phase, and 9.5% during the Composition phase. 
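The idea of ranking candidate rules by the number of extractions they produce can be sketched as follows. The exact scoring used by Bootstrap is not given in this section; this sketch assumes a plain count of extractions not yet covered by accepted rules, and the function name, rule labels, and input layout are hypothetical.

```python
from collections import defaultdict


def rank_candidate_rules(matches, accepted=frozenset()):
    """Rank candidate rules by how many new extractions each would add.

    matches:  iterable of (rule, extraction) pairs, one per rule match.
    accepted: extractions already produced by rules the user accepted.
    Returns a list of (rule, new_extraction_count), best first.
    """
    by_rule = defaultdict(set)
    for rule, extraction in matches:
        by_rule[rule].add(extraction)
    scored = [(rule, len(exts - accepted)) for rule, exts in by_rule.items()]
    # Drop rules that add nothing; break ties by rule name for stability.
    scored = [(rule, n) for rule, n in scored if n > 0]
    return sorted(scored, key=lambda item: (-item[1], item[0]))
```

Under this assumption, a candidate whose extractions are all already covered drops out of the list, which is one way to read the observation that Bootstrap stops yielding improvements after a few minutes.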

Figure 4 further reveals differences between relations. For 
married, the relatively small number of common variations was 
already captured in only 15 minutes, after which the WordSim, 
Linguistics, and Composition features provided little benefit. For 
killed, however, each of the four effort-reducing features 
substantially increased the number of extractions. This shows that 
InstaRead’s ensemble of effort-reducing features was effective in 
guiding our user to the many variations of the killed relation. 

Analysis of InstaRead’s Errors. 

InstaRead’s precision errors for the four relations were to a 
large degree caused by errors in preprocessing, especially 
dependency extraction (55%) and NER (24%). Only 21% of precision 
errors were caused by overly general rules that the expert user had 
developed. All were due to ambiguities of the words fell, executor, 
and built. While the effort-reducing features have been designed to 
increase recall, InstaRead’s focus on only deterministic rules is 
not adequate to easily handle such ambiguities - a shortcoming we 
would like to address in future work. 

Enhancing Supervised Extraction. 

Finally, we are interested in knowing if an increase in time would 
let users in the supervised case match InstaRead’s results. We 
therefore combined the annotations of all four annotators; each 
relation’s examples thus corresponded to four hours of manual 
effort. Trained on this data, SUP returned more extractions 
(attendedSchool - 51,492, founded - 7,482, killed - 220, married - 
24,866), but precision remained low and in two cases even decreased 
(attendedSchool - 12%, founded - 97%, killed - 0%, married - 
34%). In summary, additional annotation time does increase the number 
of extractions, but many more hours of annotation effort would be 
required to reach performance comparable to InstaRead. 

The features we selected have been shown to work well for many 
relations [22], but it is still possible that better features could 
improve the supervised learning algorithm’s performance. However, 
feature engineering itself takes considerable effort, usually 
measured in weeks, which would defeat our goal of building complete 
extractors quickly. It will be an important area for future work to 
determine if InstaRead can be adapted to support rapid authoring 
of rules that define feature templates, perhaps providing even better 
overall performance on a limited engineering budget. 

Comparing to Extreme Extraction Work. 

It is impossible to compare directly to Freedman et al. [11], since 
we were unable to acquire their datasets. While their approach 
yielded an average precision of 53% across 5 relations, they used 
50 hours of manual engineering and furthermore those hours were 
spread across several different experts, each with knowledge of a 
specific tool. 

Unlike InstaRead’s Bootstrap feature, their bootstrap learner 
ran autonomously without user interaction, but contributed little to 
increase overall performance. We suspect that InstaRead’s user 
in the loop, instant execution, integration of coreference 
information, and larger corpus contributed to perceived differences in 
effectiveness. Section 3 discusses further differences and 
similarities of Freedman et al.’s work and InstaRead. 

8.3 Real-world Use By Engineers 

Our experiments so far tested InstaRead’s effectiveness for a 
trained expert; in our final experiment, we evaluated if the system 
was also usable by engineers without an NLP background. 

We recruited four senior undergraduate students in Computer 
Science who used InstaRead as part of a quarter-long class project 
to develop 30 relation extractors for the TAC-KBP slot filling 
challenge. In six meetings, usage of the tool was explained and 
qualitative feedback collected. 

All four subjects were able to use the system with little 
instruction, all were able to develop extractors, and all four 
subjects reported that the tool made it easier for them than if they 
had to write their own code. Among the 27 extractors that were 
created, median precision was 94% (mean 75%), and median number of 
extractions on NYTimes data was 2283 (mean 8741). For two relations, 
no extractor was created due to the difficulty in creating custom 
entity type recognizers, and for one relation due to an 
implementation error. Mean precision was negatively affected by six 
relations which required custom entity type recognizers. InstaRead 
currently has no support for developing entity type recognizers, a 
shortcoming which we would like to address in future work. Another 
important area for improvement is the interface to manage sets of 
rules. The subjects found it was often easier for them to manage rule 
sets in code (as strings of logical expressions), because they could 
add their own comments, re-arrange, and keep track of multiple 
versions. 

9. CONCLUSIONS AND FUTURE WORK 

Many successful applications of IE rely on large amounts of 
manual engineering, which often requires the laborious selection 
of rules to be used as extraction patterns or features. 

This paper presents ways to streamline this process, proposing 
an ensemble of methods that enable three properties: an expressive 
rule language, guidance that leads users to promising rules, and 
instant rule testing. Our experiments demonstrate that InstaRead 
enables experts to develop quality relation extractors in under one 
hour - an order of magnitude reduction in effort from Freedman et 
al. [11]. To stimulate continued progress in the area, we release our 
data as explained in footnote 1. 

The experiments also point to two promising directions to further 
reduce manual effort: 

Richness of Interactions. 

With the Bootstrap, WordSim, Linguistics, and Composition 
features, InstaRead offered a variety of interactions, all of which 
contributed to increased recall while maintaining high precision. 
Bootstrap was particularly effective, but did not allow further 
improvements after a few minutes of use. WordSim did not show this 
problem, but expanded recall more slowly. Composition was very 
effective for some relations. Linguistics yielded smaller gains, but 
required less effort. Future improvements to cover additional 
syntactic variations, such as participle phrases, may increase gains. 

We consider such variety of interactions essential, and thus plan 
to include interactions for clustering phrases, providing databases 
of instances for distant supervision, editing ontologies, providing 
validative feedback, and annotating sentences. Determining the 
relative importance of such interactions will be an important future 
challenge. 

Deep Integration of Algorithms. 

Perhaps even greater potential, however, may lie in more tightly 
integrating InstaRead’s components. Our analysis of precision 
errors revealed that the majority were caused by inaccurate 
preprocessing, and we believe that jointly taking into account 
manually created rules as well as the k best outputs of the 
preprocessing components could improve results. We further suspect 
learning-based techniques may be particularly important for tasks 
such as NER, where there exist many ambiguities, while rule-based 
techniques may work well for tasks such as defining implicature 
between phrases. 

InstaRead’s Bootstrap feature could also be improved. It 
already leverages coreference clusters and syntactic dependencies. 
In fact, coreference information, which greatly increases 
recall, may explain much of bootstrap learning’s observed high 
effectiveness compared to Freedman et al.’s work. In the future, we 
would like to enable Bootstrap to also take into account our core 
linguistic rules and the ability to decompose rules. Such 
integration may expand recall and, interestingly, might also simplify 
the interaction with the user. Since the integrated components enable 
rules with higher coverage, fewer, more distinct rules would be 
returned. 
Appendix 

r(t) ⇐ str2span(‘Lee Harvey Oswald’, s) ∧ 
       span2pos(s, p) ∧ nsubj(c, p) ∧ 
       token(c, t) 

↓ translation 


SELECT ti4.tokenID 
FROM tokenInst ti0, tokenInst ti1, 
     tokenInst ti2, tokenInst ti3, 
     dependencyInst di0, tokenInst ti4 
-- token(c, t): ti4 is the token at the dependency's source position c 
WHERE ti4.offset = di0.from 
  AND ti4.sentenceID = di0.sentenceID 
-- nsubj(c, p): di0 links c to position p (ID 11 is presumably nsubj) 
  AND di0.to = ti3.offset 
  AND di0.sentenceID = ti3.sentenceID 
  AND di0.dependencyID = 11 
-- span2pos(s, p): ti3 lies inside the span from ti0 to ti2 
  AND ti3.offset < ti2.offset + 1 
  AND ti3.offset >= ti0.offset 
  AND ti3.sentenceID = ti0.sentenceID 
-- str2span: ti0, ti1, ti2 are three consecutive tokens of the name 
  AND ti2.tokenID = 79216 
  AND ti1.tokenID = 6058 
  AND ti0.tokenID = 5322 
  AND ti0.offset + 2 = ti2.offset 
  AND ti0.sentenceID = ti2.sentenceID 
  AND ti0.offset + 1 = ti1.offset 
  AND ti0.sentenceID = ti1.sentenceID 


Figure 5: Translation of a (safe) expression in first-order logic 
to SQL. The expression returns verbs for which Lee Harvey Oswald 
appears as subject. str2span and span2pos are intensional 
predicates; nsubj and token are extensional. Each predicate 
gets translated into a fragment of SQL; the fragments are 
then combined into a single SQL query, which can be efficiently 
executed. 
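To make the translation in Figure 5 concrete, the sketch below runs an equivalent query on a toy in-memory SQLite database. Everything outside the figure is an assumption for illustration: the column names `pos`, `gov`, and `dep` replace `offset`, `from`, and `to` (which collide with SQL keywords), the two example sentences and the token IDs 900, 901, and 777 are invented, and dependency ID 11 is assumed to denote nsubj.

```python
import sqlite3


def build_toy_db():
    """Create two toy sentences with one nsubj dependency each."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE tokenInst (sentenceID INTEGER, pos INTEGER, tokenID INTEGER);
        CREATE TABLE dependencyInst (sentenceID INTEGER, dependencyID INTEGER,
                                     gov INTEGER, dep INTEGER);
    """)
    tokens = [
        # Sentence 0: "Lee Harvey Oswald shot Kennedy"
        (0, 0, 5322), (0, 1, 6058), (0, 2, 79216), (0, 3, 900), (0, 4, 901),
        # Sentence 1: "Ruby shot Oswald" (the full name does not occur)
        (1, 0, 777), (1, 1, 900), (1, 2, 79216),
    ]
    conn.executemany("INSERT INTO tokenInst VALUES (?, ?, ?)", tokens)
    deps = [(0, 11, 3, 2), (1, 11, 1, 0)]  # nsubj(shot, Oswald), nsubj(shot, Ruby)
    conn.executemany("INSERT INTO dependencyInst VALUES (?, ?, ?, ?)", deps)
    return conn


QUERY = """
SELECT ti4.tokenID
FROM tokenInst ti0, tokenInst ti1, tokenInst ti2, tokenInst ti3,
     dependencyInst di0, tokenInst ti4
WHERE ti4.pos = di0.gov AND ti4.sentenceID = di0.sentenceID
  AND di0.dep = ti3.pos AND di0.sentenceID = ti3.sentenceID
  AND di0.dependencyID = 11
  AND ti3.pos < ti2.pos + 1 AND ti3.pos >= ti0.pos
  AND ti3.sentenceID = ti0.sentenceID
  AND ti2.tokenID = 79216 AND ti1.tokenID = 6058 AND ti0.tokenID = 5322
  AND ti0.pos + 2 = ti2.pos AND ti0.sentenceID = ti2.sentenceID
  AND ti0.pos + 1 = ti1.pos AND ti0.sentenceID = ti1.sentenceID
"""


def verbs_with_subject(conn):
    """Return token IDs of verbs whose subject lies in the name span."""
    return [row[0] for row in conn.execute(QUERY)]
```

On the toy data only sentence 0 contains the three-token name, so the query returns the token ID of its verb; sentence 1 is ignored even though it has an nsubj dependency.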


killOfNom(c, b) <= . . . 

. . . prep-of(c, b) A token(c, ‘assassination’) 
. . . prep-of(c, b) A token(c, ‘execution’) 

. . . prep-of(c, b) A token(c, ‘felling’) 

. . . prep-of(c, b) A token(c, ‘killing’) 

. . . prep-of(c, b) A token(c, ‘shooting’) 

. . . prep-of(c, b) A token(c, ‘slaughter’) 

. . . prep-of(c, b) A token(c, ‘slaying’) 

. . . prep-of(c, b) A token(c, ‘stabbing’) 

. . . prep-of(c, b) A token(c, ‘murder’) 

. . . nn(c, b) A token(c, ‘assassination’) 

. . . nn(c, b) A token(c, ‘murder’) 


lerRole(c) •<= . . . 
token(c, ‘assassin’) 
token(c, ‘murderer’) 


ngBNF(c, b) <= . . . 
dobj(c, b) A token(c, ‘assassinate’) 
dobj(c, b) A token(c, ‘murder’) 


ngBlnf(d, b) . . . 
dobj(d, b) A token(d, ‘assassinating’) 
dobj(d, b) A token(d, ‘murdering’) 


led(a, b) <= person(b) A person(a) A (a ^ b) A . . . 
actlnd(a, c, ‘confess’) A prepc-to(c, d) A killingBInf(d, b) 
actlnd(a, c, ‘confess’) A prep-to(c, d) A killOfNom(d, b) 
actlnd(a, c, ‘assassinate’) A dobj(c, b) 
actlnd(a, c, ‘murder’) A dobj(c, b) 


agent(c, a) A partmod(b, c) A token(c, ‘assassinated’) 
agent(c, a) A partmod(b, c) A token(c, ‘murdered’) 


agent(c, a) A rcmod(b, c) A token(c, ‘assassinated’) 
agent(c, a) A rcmod(b, c) A token(c, ‘murdered’) 

appos(a, c) A poss(c, b) A killerRole(c) 
appos(a, c) A prep-of(c, b) A killerRole(c) 
appos(c, a) A poss(c, b) A killerRole(c) 
appos(c, a) A prep-of(c, b) A killerRole(c) 
dep(a, c) A dobj(c, b) A token(c, ‘assassinated’) 
dep(a, c) A dobj(c, b) A token(c, ‘murdered’) 
infmod(a, c) A killingBNF(c, b) 

nsubj(c, a) A prep-in(c, d) A token(c, ‘suspect’) A killOfNom(d, b) 
nsubj(c, a) A xcomp(c, d) A killingBInf(d, b) 
partmod(a, c) A killingBlnf(c, b) 

partmod(a, c) A prepc-for(c, d) A token(c, ‘sentenced’) A killingBInf(d, b) 
partmod(a, c) A prepc-of(c, d) A token(c, ‘accused’) A killingBlnf(d, b) 
partmod(a, c) A prepc-of(c, d) A token(c, ‘convicted’) A killingBInf(d, b) 
partmod(a, c) A prepc-with(c, d) A token(c, ‘charged’) A killingBlnf(d, b) 

partmod(a, c) A prep-in(c, d) A token(c, ‘wanted’) A prep-with(d, e) A token(d, ‘connection’) A 
killOfNom(e, b) 

partmod(a, c) A prep-to(c, d) A token(c, ‘linked’) A killOfNom(d, b) 
passlnd(a, c, ‘accuse’) A prep-of(c, d) A killOfNom(d, b) 
passlnd(a, c, ‘charge’) A prepc-with(c, d) A killingBInf(d, b) 
passlnd(a, c, ‘charge’) A prep-with(c, d) A killOfNom(d, b) 
passlnd(a, c, ‘convict’) A prepc-for(c, d) A killingBInf(d, b) 
passlnd(a, c, ‘convict’) A prepc-of(c, d) A killingBlnf(d, b) 

passlnd(a, c, ‘convict’) A prepc-of(c, d) A prep-in(d, e) A token(d, ‘taking’) A killOfNom(e, b) 

passlnd(a, c, ‘convict’) A prep-for(c, d) A killOfl\lom(d, b) 

passlnd(a, c, ‘convict’) A prep-in(c, d) A prep-of(d, b) A token(d, ‘death’) 

passlnd(a, c, ‘convict’) A prep-of(c, d) A killOfNom(d, b) 

passlnd(a, c, ‘link’) A prep-to(c, d) A killOfNom(d, b) 

passlnd(a, c, ‘sentence’) A prep-for(c, d) A killOflMom(d, b) 

passlnd(a, c, ‘want’) A prep-in(c, d) A prep-with(d, e) A token(d, ‘connection’) A killOfNom(e, b) 
passlnd(b, c, ‘assassinate’) A agent(c, a) 

passlnd(b, c, ‘gun’) A prt(c, d) A token(d, ‘down’) A agent(c, a) 
passlnd(b, c, ‘murder’) A agent(c, a) 

passlnd(l, c, ‘take’) A token (I, ‘life’) A poss(l, b) A agent(c, a) 
passlnd(l, c, ‘take’) A token (I, ‘life’) A prep-of(l, b) A agent(c, a) 

poss(c, a) A killOfNom(c, b) 

poss(c, a) Ansubjpass(d, c)Atoken(c, ‘name’) Aprep-to(d, e)Atoken(d, ‘linked’) AkillOfNom(e, b) 

prep-by(c, a) A killOfNom(c, b) 

rcmod(a, c) A dobj(c, b) A token(c, ‘assassinated’) 

rcmod(a, c) A dobj(c, b) A token(c, ‘murdered’) 


token(a, ‘suspect’) A nsubj(c, a) A prep-in(c, d) A prep-of(d, b) A nn(e, d) Atoken(e, ‘murder’) A 

tokened, ‘trial’) 

xsubj(c, a) A killingBNF(c, b) 


Figure 6: Selected extraction rules created for relation killed 
during the experiment. Many extractions were obtained during 
the Bootstrap phase, which suggested rules combining syntactic 
dependencies (e.g. nn) and lexical information (e.g. token). 
Users selected from these suggestions, but also adapted them 
by adding constraints (e.g. prt, ≠, person). WordSim added 
lexical variety (e.g. killOfNom), and Linguistics covered additional 
verb inflections (encoded by predicates actInd and passInd). 
Composition introduced re-usable components (e.g. killerRole, 
killingBNF). 
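The re-use enabled by Composition can be illustrated with a small evaluator over toy dependency facts. This is only a sketch of the idea, not InstaRead's implementation: the fact layout (a dictionary from predicate name to a set of argument pairs) is assumed, only the prep-of alternatives of killOfNom and two of the killed alternatives from Figure 6 are covered, and the person constraints are omitted.

```python
# Hypothetical facts for "Oswald's assassination of Kennedy".
# Token positions: 0 = Oswald, 2 = assassination, 4 = Kennedy.
FACTS = {
    "token": {(0, "Oswald"), (2, "assassination"), (4, "Kennedy")},
    "poss": {(2, 0)},     # poss(c, a)
    "prep-of": {(2, 4)},  # prep-of(c, b)
}

KILL_NOMINALS = {
    "assassination", "execution", "felling", "killing", "shooting",
    "slaughter", "slaying", "stabbing", "murder",
}


def kill_of_nom(facts):
    """killOfNom(c, b): c is a killing nominal and b is its 'of' object."""
    heads = {pos for pos, word in facts.get("token", set()) if word in KILL_NOMINALS}
    return {(c, b) for c, b in facts.get("prep-of", set()) if c in heads}


def killed(facts):
    """Evaluate poss(c, a) ∧ killOfNom(c, b) and prep-by(c, a) ∧ killOfNom(c, b)."""
    nominal = kill_of_nom(facts)
    result = set()
    for relation in ("poss", "prep-by"):
        for c, a in facts.get(relation, set()):
            for c2, b in nominal:
                if c == c2 and a != b:  # same nominal, distinct arguments
                    result.add((a, b))
    return result
```

Because both alternatives call `kill_of_nom`, adding a new nominal such as ‘slaying’ to the shared component extends every rule that uses it, which is the effect the Composition feature is meant to provide.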






